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Abstract 

Context. Herschel and Planck are surveying the sky at unprecedented angular scales and sensitivities over large areas. 
But both experiments are limited by source confusion in the submillimeter. The high confusion noise in particular 
restricts the study of the clustering properties of the sources that dominate the cosmic infrared background. At these 
wavelengths, it is more appropriate to consider the statistics of the unresolved component. In particular, high clustering 
will contribute in excess of Poisson noise in the power spectra of CIB anisotropies. 

Aims. These power spectra contain contributions from sources at all redshift. We show how the stacking technique can 
be used to separate the different redshift contributions to the power spectra. 

Methods. We use simulations of CIB representative of realistic Spitzer, Herschel, Planck, and SCUBA-2 observations. We 
stack the 24 fim sources in longer wavelengths maps to measure mean colors per redshift and flux bins. The information 
retrieved on the mean spectral energy distribution obtained with the stacking technique is then used to clean the maps, 
in particular to remove the contribution of low-redshift undetected sources to the anisotropies. 

Results. Using the stacking, we measure the mean flux of populations 4 to 6 times fainter than the total noise at 350 /xm 
at redshifts z = 1 and z = 2, respectively, and as faint as 6 to 10 times fainter than the total noise at 850 fim at 
the same redshifts. In the deep Spitzer fields, the detected 24 /xm sources up to z~2 contribute significantly to the 
submillimeter anisotropies. We show that the method provides excellent (using COSMOS 24 /xm data) to good (using 
SWIRE 24 /xm data) removal of the z < 2 (COSMOS) and z < 1 (SWIRE) anisotropies. 

Conclusions. Using this cleaning method, we then hope to have a set of large maps dominated by high redshift galaxies 
for galaxy evolution study (e.g., clustering, luminosity density). 

Key words, infrared: galaxies -galaxies: evolution - (cosmology:) large-scale structure of universe - Methods: statistical 



1. Introduction 

The first observational evidence of the cosm i c in frared 
background ( CIB) was reported b y [Puget et aL (1996) and 
confirmed by |Fixsen et al.| (1998) and Hauser et aL (1998). 
The CIB is composed of the relic emission at infrared 
wavelengths of the formation and evolution of galaxies and 
consists of contributions from infrared starburst galaxies 
and to a lesser degree from active galactic nuclei. Deep cos- 
mological surv eys of this background have been carried out 



with ISO (see [Genzel fc Cesarsky[ |2000[ 



reviews) mainly at 15 /im with ISOCAM (e.g. 



2005 



2002 
2001 



et a 



„ baz et al. 

; at 90 and 170 /im with ISOPHOT (e.g., [Dole etoT 
j with Spitzer at 24, 70, and 160 /im (e.g., Papovia 



2004 



Dole et al. 



instruments SCUBA 



[2004J and w ith ground-based 
Blain et ' 



Beelen et al., 2008) 



g., [Blain et aL| [2002] ), L ABOCA [Farrah et aT] ( [2QQ6| 
)8l) ,and MAMBO (e.g., |Bertoldi| measured a strong cr 



of the CIB and its sources (see Lagache et al. , 2005 , for 
a general review) but many questions remain unanswered 
such as the evolution of their spatial distribution with 
redshift. 



The spatial distribution of infrared galaxies as a 
function of redshift is a key component of the scenario 
of galaxy formation and evolution. However, its study 
has been hampered by high confusion and instrumental 
noise and/or by the small size of the fields of observation. 
Tentativ e studies, with a small number of sources at 
850 /im (Blain et al.( |2004[), found evidence of a relation- 
ship between submillimeter galaxies and the formation 
of massive ga l axies in de nse environments. Wo r ks by 



|et al.| |2000|) at 850, 870, and 1300 /im respectively. The 
balloon-borne experiment BLAST performed the first deep 
extragalactic surveys at wavelengths 250-500/im capable 



and 
lustering 



Magliocchetti et al.[ ( [2008 ) 
of ultra luminous infrarei 



galaxies (ULIRG) detected with Spitzer at high redshifts. 
Alternatively, the infrared background anisotropies could 
also provide information about the correlation between 
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their contributions to the CIB (Devlin et al. 


2009). These 


2000 Knox et al., 


2001 


1 'Amblard & Coorayl 2007p, and 


surveys allowed us to obtain a far clearer understanding 


its redshift evolution. | 


!jagache et al. (|2007| and Viero 



et al. (2009) reported the detection of a correlated compo- 



nent in the background anisotropies using Spitzer/MIPS 
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(160 /im) and BLAST (250, 350, and 500 /im) data. 
These authors found that star formation is highly biased 
at z>0.8. The strong evolution of the bias parameter 
with redshift, caused by the shifting of star formation to 
more massive halos with increasing redshift, infers that 
environmental effects influence the vigorous star formation. 

To improve our understanding of the formation and 
evolution of galaxies using CIB anisotropies, we need more 
information about the redshift of the sources contributing 
to the CIB. We also need a method that allows to go 
deeper than the confusion noise level. In this context, an 
invaluable tool is the stacking technique, which allows 
a statistical study of groups of sources that cannot be 
detected individually at a given wavelength. Its requires 
the knowledge of the positions of the sources being 
"stacked" as inferred from their individual detection at 
another wavelength. This knowledge is then used to stack 
the signal of the sources at the wavelength at which 
they cannot be detected individually. Since the signal of 
the sources increases with the number of sources N and 
the noise (if Gaussian) increases with a/TV, the signal- 
to-noise ratio will increase with \fN . For an additional 
description of the basics of sta ckin g techniques we ref er 
to for example [DQle et al.| ( |2QQ6l ) and |Marsden et al.| ( |2QQ9| ). 



(e.g. iBaldry et al"! |2008). 



Stacking was used to measure the contribution of 24 /im 
galaxi es to the background at 70 and 160 /im using MIPS 
data ( Dole et al.^ ,2006) . Contribution from galaxies down 
to 60 iiJy at 24 /im is at least 79% of the 24 /im, and 80% 
of the 70 and 160 /im backgrounds, respectively. At longer 
wavelengths studies used this technique to determine 
the contribution of populations selected in the near- and 
mid-infrared to the FIRB (far-infrared background) back- 

f 'und: 3.6 /im sele cted sources to the 850 /im background 
ang et al., 2006) and 8 /im and 24 /im sele cted sources 
to the 850 /im and 450 /im backgrou nds (|Dye et all 



2006| [Serjeant et al.[[2QQ8] ). Finally, Ma rsden et al.| ( |2009| 
measured total subniillimeter intensities associated with al 



24 /im sources that are consistent with 24 micron-selected 
galaxies generating the full intensity of the FIRB. Similar 
studies with Planck and Herschel will provide even more 
evidence about the nature of the FIRB sources. 

Theoretically, a stacking technique also could be used 
to study the mean SED ( spectral energy dist ribution) of 



the stacked sources (e.g., Zheng et al. , 2007). The main 
potential limitations would be caused by the errors in the 
redshifts of the sources and an insufficiently large number 
of sources to stack per redshift bin. The observation of 
sufficiently large fields to which the technique can be 
applied is now assured by the to Spitzer legacy surveys 
FIDEL, COSMOS, and SWIREQand Planck and Herschel 
surveys. Advances in the measurement of the redshift have 
also been accomplished, alt hough for very small fields for 



sources up to z ~ 2 (e.g. [Caputi et al. , 2006[ ), and for 



the larger COSMOS fields u p to z ^ 1.3 with very high 
accuracy (Ilbert et al.[|2009|). Future surveys are planned 
to measure the redshifts in larger fields such as the dark 
energy survey (DE^ or the GAMA spectroscopic survey 



The difficulties in separating the contribution to the 
signal coming from different redshifts have handicapped 
the study of CIB anisotropies. However, once the mean 
SEDs of infrared galaxies per redshift bin are obtained 
we can use this information to analyze CIB anisotropies. 
The SEDs obtained with the stacking technique can be 
used to "clean" the low-redshift anisotropies (or at least 
a significant part of them) from the CIB maps. This can 
be performed by subtracting the undetected low-redshift 
{z < 1 — 2) populations from the maps using their mean 
colors and thus build maps dominated by sources at higher 
redshifts. This also facilitates the study of the evolution 
of large-scale structures at high redshift by removing the 
noise coming from low redshifts. 

In this p aper, we use the simulat ions and catalogs 
presented in Fernandez-Conde et al. ([2008 to study 
the limitations of stacking techniques in CIB anisotropy 
analysis. We stack 24 /im sources detected with MIPS in 
Planck, Herschel, and SCUBA-2 simulated observations. 
The catalogs and maps were created for different levels of 
bias between the fluctuations of infrared galaxy emissivities 
and the dark matter density field. We use a bias h = 1.5, 
which is very close to that measured by ,Lagache et al.| 



( 12007 ). 



The paper is organized as follows. In Sect.|2j we explain 
the method used to study the capabilities of the stacking 
once the redshift of the sources is known. Section |3] details 
the elements that limit the accuracy of the stacking tech- 
nique. In Sect. [4j we test the technique for studying the 
mean SEDs of galaxies. In Sect.|5j the feasibility of using in- 
formation about the SEDs to clean the observations of low- 
redshift anisotropies is studied. The results are summarized 
in Sect.[6j Throughout this paper, the cosmological parame- 
ters are assumed to be = 0.71, I^a = 0.73, Vt^^ = 0.27. For 
the dark-matter linear clustering, we set the normalization 
to be <J8 = 0.8. 



2. Description of the method 



Dole et aT] (2006) considered every MIPS 24 /im source 
in selected fields with fluxes >60 /iJy and then sorted 
the 24 jam sources by decreasing flux at 24 /im (hereafter 
5*24 ). The sources were placed in 20 bins of increasing 
flux density. These bins were of equal logarithmic width 
AS2A/S2A ~ 0.15, except for the bin corresponding to 
the brightest flux, to take all the bright sources. They 
then corrected the average flux obtained by stacking 
each 6^24 bin fo r incom pleteness using the correction of 
Papovich et al. (2004). This allowed them to determine 
lower limits to the CIB at 70 /im and 160 /im, and to flnd 
the contribution from galaxies down to 60 /iJy at 24 jam 
to be at least 79% of the 24 /im, and 80% of the 70 and 
160 /im backgrounds. 

While these measurements of the total flux are useful 
for estimating the overall energy emitted by these pop- 
ulations (see also [Marsden et al.[ 2009 ) , it does little to 



^ http:/ /ssc. spitzer. caltech.edu/legacy/ 
^ http:/ /www. darkenergysurvey.org/ 



^ The simulations are publicly available at [http:/ /www.ias.u- 
[psud.fr / irgalaxies| 
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improve our knowledge of individual sources. To use the 
average flux efficiently we fiave to decrease tfie dispersion 
in tfie individual fluxes (at tfie long wavelengtfi) around 
tfie average flux of the population. We can do this by 
separating large populations of sources into smaller and 
more homogeneous SED populations. 

One of the main sources of flux dispersion is the mea- 
surement of the mean flux using galaxies with very different 
redshifts. The lack of accurate redshifts (up to z~2) across 
large fields has so far limited the use of detailed redshift 
information in stacking analysis. Because of this, the fluxes 
of sources with different SEDs are averaged together and 
the mean flux is a poor estimator of the fluxes of individ- 
ual sources. However advances in the measurement of the 
redshifts are expected in the coming years with the new 
generation of spectrosco pic and photometr ic redshift sur- 
veys such as GAM A (e.g. jBaldry etlI||2QQ8| ), (Big-)BOS^ 
DE^ We developed a niethod that assumes that redshifts 
are known and investigated the limitations of stacking tech- 
niques caused by the uncertainties in the redshifts. We as- 
sessed the dispersion in the fluxes of individual sources with 
different redshift errors and the influence of this dispersion 
on the quality of the results using our simulations since this 
information will not be available in the real observations. 



2.1. Stacking technique 

We used our simulations to study the limitations of the 
stacking technique using 24 fim MIPS sources in Planck, 
Herschel, and SCUBA-2 observations. The choice of this 
wavelength (24 fim) is motivated by several reasons. 
Firstly, 24 jiva is a good tracer of infrared galaxies (unlike 
e.g., near-infrared detections). Secondly, 24 /im-selected 
galaxies emit the bulk of the CIB up t o at least 500 fim. 



(|Dole et al.[ |2006i Marsden et al.[ [2009). Thirdly, 24 /im 
Spitzer observations provide large and deep surveys, with 
redshift distribution of its sources extending up to redshift 
z ^ 2.5. The schematic description of our stacking process 
follows. The only requirements are knowledge of both the 
redshifts of the sources and their fluxes at 24 /im. 

The detected sources at 24 /im will be characterized by 
two parameters S2A and z. We first remove from the long 
wavelength map (hereafter A map) the so urces detected in- 
dividually, us ing the criteria described in |Fernandez-Conde| 
et al. (2008). These sources are no longer considered in 
the discussion, so whenever we refer to sources we refer to 
those detected at 24 /im with 5*24 greater than the detec- 
tion threshold and not those detected individually in the 
A map. The sources are then distributed into redshift bins. 
The width of the redshift bins have to be optimized for each 
observation. These bins cover the redshift interval between 
z = and z = Zmaai^^ where ^max is chosen depending on 
the goals of the worl^ We stack independently the sources 
in each redshift slice. For the sources in a given redshift 
slice i {z^siice)^ process of detection is as follows: 



^ http:/ /www. sdss3.org/cosmology.php 
^ http:/ /www. darkenergysurvey.org/ 



We analyze the stacking up to Zmax — 2 since reliable esti- 
mates of the redshift up to that redshift are available (although 
over quite small areas). 



1. Firstly, we order the sources by decreasing 6*24 . We start 
by stacking in the A map the sub-images of the two 
sources with higher 6*24 (that have not been detected 
individually). Then we measure the signal-to-noise ra- 
tio of the resulting image. A detection is achieved when 
the signal-to-noise ratio is higher than a certain detec- 
tion threshold. This detection threshold is optimized for 
different observations. For the cases discussed in this pa- 
per, we use a detection threshold of three. If we do not 
achieve a detection we stack more sources (always se- 
lecting the next brighter sources at 24 /im^Q This is 
done until we attain the required signal-to-noise ratio. 

2. Once a detection is achieved, we assign to all sources 
stacked together a flux equal to the total flux measured 
in the stacked image divided by the number of sources. 

3. After detection, we restart the process starting from the 
brightest sources that we have not yet stacked. 

4. Sometimes the last (and therefore faintest) group of 
sources in the redshift slice is not successfully stacked 
by this algorithm because an insufficient number of faint 
sources remains to be stacked in this last iteration. To 
correct for this, we simply carry out the algorithm start- 
ing this time from the faintest sources and stacking pro- 
gressively brighter sources until we achieve a detection. 
Although in this procedure the last two mean flux bins 
are not independent, the consequences in terms of sys- 
tematic errors are negligible (since the sources affected 
are few, faint, and the relative error in the stacking is 
small). 

Once this process is complete we assign a mean "stacked" 
flux to every source of the redshift slice. The errors in the 
fluxes of the sources measured by stacking are computed 
to be the total nois e measured in the map (follow ing the 
method described in 'Fernandez-Conde et al. (2008)) multi- 
plied by \/N /N^ where N is the number of stacked sources. 
We repeat this process for all the redshift slices until we 
have a measurement of the flux at A for all the sources in 
the catalog. In the 3 dimensional space of 5*24 — z — Sx, we 
then have a set of points 5*^4 — z^^ — S^^ corresponding to 
different successful stackings. For each successful stacking, 
the coordinates in each of the three axes are the following: 

— S^^ : The mean Sa of the sources of the i^^ stacked 
population, where a is the reference wavelength (here 
24 /im). 

— z^^ : The mean redshift of the sources of the i^^ stacked 
population. 

— S^^' : The mean Sx found for the sources using the stack- 
ing technique for the i^^ stacked population. 



Redshift slice optimization: Our algorithm assumes that 
sources at similar z and of similar S24 have similar char- 
acteristics at other wavelengths. Our best option to avoid 
substantial variance in Sx between the stacked sources is 
to try to avoid stacking together sources of very different 
5*24 or z. In this context, the size of the redshift bins were 
empirically optimized to ensure that (1) our detections are 
of high signal-to-noise ratio, (2) we achieve successful de- 
tections in each redshift slices without having to stack to- 
gether sources of very different Sa (by more than a factor 



^ To decrease the computation time, we increase the number 
of sources to be stacked using a logarithmic step of dN/N = 1.5. 
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of three), and (3) the redshift shces are as thin as possi- 
ble while complying with the first conditions. The redshift 
slices are chosen differently for each observation to comply 
with these criteria. 



2.2. Color smoothing 

The algorithm discussed above is quite simplified because it 
assumes that all sources detected in the same redshift bin 
have the same color Sx/Sa- In contrast we would expect 
there to be a continuous variation of Sx/Sa with both Sa 
and z. Following this assumption allows us to interpolate 
values between detections at different Sa for each redshift 
slice. A more complicated means of correction is to smooth 
our predictions by interpolating through the grid formed 
by the set of points S^^ — z^^ — S^^ found with the stack- 
ing algorithm described above for the whole Sa — z plane. 
We do this with the IDL function TRIGRID, which given 
data points defined by the parameters S^^ — z^^ — S^^ and 
a triangulation of the planar set of points determined by 
5*^4 and z^^ returns a regular grid of interpolated val- 
ues. We tried both approaches and found that the differ- 
ences between the results for the two different smoothings is 
very small so from now on we use only the "S'a smoothing". 
Figure [T]a shows the fluxes at 350 /im (with 1.5 < z < 1.6) 
before and after the two dimensional smoothing. It shows 
the real fluxes of the sources (known from the simulations), 
the recovered fluxes using the smoothing technique, and 
the recovered fluxes without smoothing. We can see that 
the smoothing greatly improves the accuracy of the fluxes. 
After this correction, the results are in very good agreement 
with the input fluxes. 



HERSCHEL 1.5 < z < 1.6 



3. Limitations of the method 

We now test the limitations of the method related to the 
difficulties we expect to face when real data are analyzed 
(e.g., intrinsic dispersion in the colors of the sources, 
errors in the measurement of the fluxes and in redshift s, 
clustering To illustrate the limitations, in this section 
we use the simulations at 350 /im. We reached the same 
conclusions using other far-infrared and submillimeter 
wavelengths. The size of the redshift slices that divide the 
^24 — z space was chosen to be dz = 0.1; wider redshift 
slices would stack together sources with very different 
fluxes; smaller redshift slices led to too low signal-to- noise 
ratios. 

Two different Spitzer surveys are used, COSMOS and 
SWIRE. COSMOS is a deep observation with a complete- 
ness of -100% up to ^24 = 80/iJy ( iSanders et al.j |2007). 
It allows us to test the stacking of faint sources. COSMOS 
covers a smaller field than SWIRE (2 sq. deg. versus 50 
sq. deg.) hence its stacking measurements are less accu- 
rate for bright s ources. Thus we also use the much larger 



SWIRE survey ( [Lonsdale et al. , 2004), which is less deep 
(5*24 > 270 /iJy) out covers —25 times more arccj^ We ana- 
lyze the stacking of 24 /im sources for two study cases: ob- 
servations in the far-infrared with Herschel at 350 /im and 



^ The problems associated with errors in the measurement of 
5*24 are considered negligible (see Sanders et al. , 2007). 

And therefore should have times more signal-to- 

noise ratio for similar populations of sources. 




0.1 0.2 0.3 0.4 0.5 0.6 0.7 
(mjy) 

HERSCHEL 1.5 < z < 1.6 




0.30 



0.35 



0.40 
S24 (mJy) 



0.45 



Figure 1. Top: Input fluxes of the sources in the redshift 
slice 1.5 < z < 1.6 (solid line) together with estimates 
of the fluxes of the sources using the smoothing technique 
(dashed line) and estimates of the fluxes of the sources with- 
out smoothing (diamond). Bottom: The same but zoomed 
for 0.3 mJy < S24 < 0A7mJy. 



(in the next section) observations in the submillimeter with 
Planck and SCUBA-2 at 850 /im. The characteristics of the 
Herschel/SPIRE, Planck/HFI, and SCUBA-2 observations 
are the following: 



Stacking in the COSMOS field: 

— Detection limit: ^'24 > 80 /iJy at 24 /im. 

— Size of the field: 2 sq. deg. 

— Linear bias: b = 1.5. 

— Type of observation with Herschel: 350/im "Deep" (with 
1(7=12.3 mJy). 

— Type of observation with SCUBA-2: 850/im (with 
lcr=lmJy). 

Stacking in the SWIRE fields: 

— Detection limit: 6*24 > 270 /iJy at 24 /im. 

— Size of the field: 50 sq. deg. 

— Linear bias: b = 1.5. 

— Type of observation with Herschel: 350 /im "deep" (with 
la=12.3 mJy). 

— Type of observation with SCUBA-2 and Planck: 850 /im 
(with 1(7=1 mJy and 1(7=46.7 mJy - see Table 4 from 



Fernandez-Conde et al. (2008)- respectively) 



3.1. Cold and starburst populations 

Figure |2] shows the histograms of the fluxes at 350 /im for 
a stacking box with 0.5 < z < 0.6 and 0.5 < 6*24 < 1 mJy. 
The main source of error in the estimate of the fluxes for 
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this case would not be the dispersion in either S2A or z 
but the presence of two different populations, which are in- 
distinguishable using observations at shorter wavelengths. 
These two populations are the starburst and the nor- 



350 /*m; 0.5 < z < 0.6; 0.5 < < 1 



(2003) 



mal (cold) populations described in Lagache et al _ 
Figure |3] shows the number of starburst and normal sources 
as a function of z for sources with 80 < 5*24 < 270 /iJy, 
0.27 < 5*24 < 1 mJy, and ^24 > 1 mJy. For the three afore 
mentioned cases, the cold sources are the dominant popu- 
lation for z < 0.8, z < 0.6, and z < 0.5 respectively. There 
are no effective ways of separating these two populations, 
and this will cause poor estimates of the mean colors of each 
population. This is particularly important when the num- 
ber of sources of each type is approximately equal. This is 
because we add together two populations of very different 
^350 (cold sources are in general brighter in the submillime- 
ter than starburst sources at the same redshifts and with 
similar 824)- When one of the populations dominates, this 
problem becomes negligible. 



3.2. Errors caused by intrinsic dispersion in colors 

Because of the lack of constraints on SEDs at long wave- 
lengths and their evolution with redshift, the Lag ache et al. 
(2004|) model does not take into account that galaxies of the 



same luminosity and redshift could have different values of 
Sx (apart from the distinction between normal and star- 
burst sources). To assess the effect of this dispersion, we 
introduce a random Gaussian error into the flux estimated 
with the stacking for each of the stacked sources. The er- 
rors that we make using this procedure are equivalent to 
those that we would make if we were to use a model with 
an intrinsic Gaussian dispersion in the Sx of the sources. 
This type of error does not affect the results for the mean 
of the sources but the average difference between this mean 
and the fluxes of the individual sources. We test the effect 
on our results for different levels of dispersion (measured 
in terms of the standard deviation in the dispersion com- 
pared to the mean flux of the sources). In Fig.|4j we can see 
the histograms of the errors for a dispersion of 0%, 10%, 
25% for all sources with 824 > 270 jiiJy. As expected, the 
figure illustrates how the histograms broaden with disper- 
sion. For a standard deviation in the errors of the fluxes 
associated with the stacking ast and a standard deviation 
associated with the fluxes aoisp ^the final standard devi- 
ation in our errors aTot would be aTot = \J^st ~^ ^Disp' 
We do not analyze other statistical representations of this 
effect (i.e., non-Gaussian intrinsic dispersion) since we do 
not have any strong observational constraints. 



3.3. Redshift uncertainty 

The effect of redshift errors are difficult to evaluate. This 
is because they combine with the non-linear k-correction, 
making the variation in Sx with z complex. In Sect. |4j we 
study the effect of redshift errors for two different relative 
errors ^ = 3% and ^ = 10%. 



3.4. Problematic areas of the S24-z space 

Figure [5] shows the errors in the estimate of the mean fluxes 
in the ^24 — z space for a 350 fim Herschel observation of 




Figure 2. Histogram of the fluxes at 350 jam for a stacking 
box with 0.5 < z < 0.6 and 0.5 < 5*24 < 1 mJy. The mean 
value of the sources is S330 ~ 17 mJy. The two different 
populations are the normal cold sources {left population) 
and the starburst sources (n^/it population). It is clear that 
the main cause of error in our flux measurement comes from 
us stacking together two different populations. As expected, 
we checked that reducing the redshift slice does not reduce 
the dispersion. 



0.08 mJy < S24 < 0.27 mJy 













0.27 mJy < S24 < 1 mJy 





Figure 3. Histograms of the number of cold (thin line) 
and starburst (thick line) sources per 24 /im flux bin (his- 
tograms are normalized to the higher number of sources 
per histogram). The problematic regions are those where 
both populations have similar number of galaxies. This is 
especially important for 0.6 < z < 0.7 and faint sources. 



Dispersion 0, 10, 25 % 




2.0 



Figure 4. Ratio of recovered fluxes from stacking (S^Iq^^) 
to input fluxes in the simulation (S'JfoO sources with 
5*24 >270/iJy, with no additional dispersion in the fluxes 
at 350 /im (thick solid line), and with 10% (dotted-dashed 
line) and 25% (dashed line) additional dispersion. 
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the COSMOS field witfi redsfiift errors ^ = 3% and ^ = 
10%. For tfie estimates of tfie fluxes, we can easily identify 
several problematic areas in tfie 6*24 — z space. Tfiese are: 
data points at very low redsfiifts < 0.1), tfie brigfitest 
sources because of small number statistics and the faintest 
sources because of flux errorj^ 



Low z: There are very few sources at z < 0.1. This prevents 
the stacking from achieving high signal-to- noise ratio levels. 
This translates into large errors in the measurement of the 
mean fluxes for sources with z < 0.1. 

Bright sources: These sources are rare and we are therefore 
unable to reach signal-to-noise ratios as good as for fainter 
sources. We expect the results for bright sources to be bet- 
ter when the stacking technique is applied to larger fleld s 
(for example using the WISE survey (Mainzer et al. , 2005[ )). 
We should keep this in mind when analyzing the results in 
our study cases. 



Faint sources: Another shortcoming of the method is that 
the smoothing techniques cannot be applied to sources 
fainter than the stacked flux of the faintest bin. The best 
solution is to assume for the last point given by the stacking 
that all the sources have the same color, which is equiva- 
lent to assuming that their color is the same as that of the 
sources that are slightly brighter than them. 

4. Application of the method 

We now verify the accuracy of the method with realistic 
simulations of observations including redshift errors and by 
using existing observations at 24 fim with Spitzer. 

4.1. Stacking Herschel data in the far-infrared: 350 fim 

We comment on the main issues and sources of error en- 
countered when stacking 24 jiva sources in Herschel/Spire 
observations at 350 /im and considering a detection thresh- 
old of 3 <j. We note that the difficulties faced by the stack- 
ing technique at 250 and 500 /im are similar. We use a 
division in the z axis with redshift slices of dz = 0.1. We 
analyze the results for two redshift errors, an optimistic one 
of ^ = 3% and a pessimistic one of ^ = 10%. This il- 
lustrates the degradation in the quality of the results with 
redshift error. 

Errors in individual recovered fluxes: Figure [6] shows the er- 
rors in the estimate of the fluxes of the sources with the 
stacking technique for redshifts < z < 1 and 1 < z < 2 
for an observation of the COSMOS fleld at 350 /im. Three 
different estimates are shown: one compiled using stacking 
without "smoothing" and two others created using two dif- 
ferent smoothing techniques (in either z or both z and 6*24, 



Note that the top right area with no data plotted corre- 
sponds to a region where they are no sources at 24 microns; 
note also that in color representations as in Fig. [5] small dif- 
ferences in estimated value can have a great visual impact due 
to the variation in colors. A mere 20% change in the estimate 
can change the color from green to red. The general variation is 
consistent with our detection threshold of 3a. 
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Figure 5. Accuracy in the mean recovered fluxes at 350 /im 
in a COSMOS-like observation when considering redshift 
errors of ^ = 3% (top) and ^ = 10% (bottom). The 
colors (shading) correspond to different values of the ac- 
curacy, while the vertical axis is 6*24 and the horizontal 
axis is the redshift bin. Left: relative errors in the mean 
recovered fluxes {S/S^iS''^ = {§1^^''^ - S^^l""^)/ S^^l""^) for ah 
the_5'24 — z space. Right:_the same but in absolute values 
|V6'f5^o''^^| = |(5'35o^^ -5'||^o'0/^35o''^| ^ud decreasing the 
dynamic range of the plot (0-0.5) to illustrate the errors 
more clearly. The 6*24 — z space is divided linearly in z and 
logarithmically in 6*24. Redshifts are given on the bottom- 
right flgure. Log (824) (in mJy) on the left flgures. 



cf. Sect. 2.2). The differences between the estimates ob- 



tained using the two smoothing techniques are quite small 
for most sources. The flgures show rather good agreement 
between the input values of the fluxes and those found by 
the stacking technique. The results improve for 2: > 1 com- 
pared to those at z < 1. This is because of the low signal- 
to-noise ratio at low z and the two-population problem. 
As expected, the results degrade with the redshift error. 
The results also improve when either of the two smooth- 
ing algorithms are used. Figure [7| shows the results for a 
SWIRE observation. The recovered fluxes are more accu- 
rate because the larger number of sources allows us to ob- 
tain higher signal-to-noise ratios for the stacking (but it is 
limited to ^24 > 270 /iJy). 

Limit for faint sources: Stacking in the COSMOS fleld al- 
lows the detection of sources as faint as 5*350 = 2.1 ± 0.7 
mJy at Z'^ 1, which is 6 times lower than the noise (1 a). At 
z~ 2, we achieve detections for sources with 5350 = 3 ± 1 
mJy or 4 times lower than the noise. This is equivalent 
to a gain in the signal-to-noise ratio of a factor of 18 and 
12, respectively, with respect to the Sa detection. If the 
Spitzer data were complete down to lower fluxes, we should 
be able to successfully detect those sources too. The stack- 
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COSMOS 0<z<1; S24> 80/xJy COSMOS 1<z<2; S24> 80/xJy 
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Figure 6. Relative errors in recovered fluxes for individual sources at 350 /im in a COSMOS-like observation for redshift 
errors ^ = 3% (top figures) and ^ = 10% (bottom figures). Left: for ^24 > 80 /iJy and redsfiifts < z < 1. Right: for 
6*24 > 80/iJy and 1 < z < 2. The zeros represents a perfect estimate. Three estimates are shown: direct values obtained 
with the stacking (dotted line); values obtained with the stacking and smoothed in z (thin solid line), and smoothed 
both in z and in 5*24 (thick solid line). 



SWIRE 0<z<1; S24> 270yuJy 



SWIRE 1<z<2; S24> 270yuJy 
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Figure 7. Same as Fig. [g] but for an observation at 350 /im if the SWIRE fields. 



ing method at 350 jam is limited by the Spitzer detection 
limits. 



Mean errors: The final results for the fluxes and colors of 
the sources obtained using the stacking technique are com- 
pared with the real (input) values in Figs. |8] and [oj They are 
in very good agreement with the input nuxes (called real 



fluxes in the figures) but to obtain a clearer idea of the er- 
rors we show on Fig. [To] two plots of the mean flux relative 
erroi[^per box of 6*24 — z. The left figure shows the rela- 
tive differences between our mean estimated flux (using the 



Note that the mean flux relative error is equivalent to the 
mean color relative error since there is no error in our S24 mea- 
surements. 
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Figure 8. Left: Real fluxes of the sources at 350 /im (in 
mJy) in the space ^24 — z. Right: fluxes found by the 
smoothed stacking technique. The colors correspond to dif- 
ferent values of the flux, while the vertical axis is S^^ and 
the horizontal axis is the redshift bin. The 5*24 — z space is 
divided linearly in z and logarithmically in 6*24 . 
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Figure 9. Left: Real S^^q/S2a flux ratio of the sources in 
the space 5*24 — z. Right: 83^0/824 flux ratio found by the 
smoothed stacking technique (right). The colors correspond 
to different values of the ratio, while the vertical axis is S24 
and the horizontal axis is the redshift bin. The S24 — Z space 
is divided linearly in z and logarithmically in 5*24. 

stacking technique) and the flux of the sources introduced 
in the model. Yellowish colors represent overestimates of 
the source fluxes compared to their input fluxes. Darker 
colors represent underestimates. The right flgure shows the 
same relative error but this time in absolute value. We can 
see that the larger errors, which can be as high as 50%, 
are made for sources at z < 0.1. This is because the small 
number of sources at these redshifts prevents the stacking 
from achieving sufficiently high signal-to-noise ratios. For 
the bulk of sources however, the errors in the mean flux are 
smaller than 10%. The errors associated with the problem 
of 2 populations cannot be illustrated by these flgures be- 
cause this problem does not affect the accuracy of the mean 
value found for a set of sources but the dispersion in the 
fluxes of individual sources around this mean value. 

4.2. Stacking Planck and SCUBA-2 data at 850 /am 

When applying the same technique to Planck observations 
at 850 /im, we encounter a fundamental limitation of the 




Figure 10. Same as Fig.|5|but with no redshift errors. 
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Figure 11. Lateral cut of a stacking image (Planck/HFI at 
850 jam) for very faint sources (^24 ~ 100 jiiJy). The total 
signal (normalized to 1 at the peak), the signal from both 
the clustering and the sources are the solid, dotted, and 
dashed lines, respectively. One pixel equals 25 arcsec. 



stacking technique. In the stacked image, we can discern 
two contributions to the peak, one associated with the 
stacked sources, which has the shape of the PSF, and 
another broader peak around it which is associated with 
the sources correlated with the stacked sources. The 
method works easily when the PSF width is much smaller 
than the width of the correlation peak. However, this 
condition is not fulfllled for Planck observations where 
the width of the correlation signal around sources is not 
very different from the width of the PSF. Furthermore, 
when stacking faint sources, 6*24 ~ 100 /i J?/, the signal 
associated with the correlations is much stronger than 
that of the sources: it becomes impossible to distinguish 
between the signal from the sources being stacked and 
the signal from the clustering. Figure [TT] shows a cut of a 
stacked image for very faint sources (5*24 ~ 100 jiiJy). The 
flgure shows the total signal, the signal coming from both 
the clustering and the sources. For these faint sources, we 
can see that the signal from the clustering of the sources 
is more important than that of the stacked sources and 
their FWHMs are very similar. Several attempts were 
made to correct this problem. By far the most effective 
solution is to use additional observations with a narrower 
PSF at similar wavelengths to estimate the fraction of the 
flux that is associated with the clustering. This method is 
described hereafter. Another possible solution that does 
not rely on complementary observations is presented in 
Appendix A. 
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The problem caused by the clustering contribution to 
the flux measured with Planck/HFI makes it difficult to use 
this instrument alone to estimate the fluxes accurately . It 
is therefore necessary to use observations with other instru- 
ments with smaller FWHM. In the far-infrared, we could 
use Herschel (for the same channel as Planck at 350 /urn). 
For the submillimeter observations, we will have to use 
ground-based submillimeter instruments (e.g., future cam- 
era SCUBA-2 at 850 /im or LABOCA at 870 /im). 

SCUBA-2 observation of the COSMOS field at 850 /im 

We analyze here the stacking of sources in the COSMOS 
field observed with SCUBA-2. SCUBA-2 wih have a 
very good sensitivity; we use an estimate of the noise for 
these observations of a = 1 mJy, close to that specified 
in the SCUBA-2 webpagj^ Because the signal of the 
sources at 850 /im is much fainter relative to the noise 
than with Herschel at 350 /im, we have to increase 
the size of the redshift bins to achieve detections. We 
take the following boundaries for the redshift slices 
0,0.1,0.4,0.8, 1., 1.2, 1.5, 1.8, an(i2. 2. We use the same 
detection threshold as that used for Herschel at 350 /im 

{Dthres — 3). 

Figure [12] shows the errors in the estimate of individual 
fluxes of 850 /im sources for 5*24 > 80 /iJy and redshifts 
I < z < 2 with redshift errors of ^=0% (top), 3% 
(middle), and 10% (bottom). The results are poorer than 
those at 350 /im (Fig. |6|. This is because the signal of the 
individual sources is weaker relative to the noise at 850 /im 
than at 350 /im. The results are clearly dependent on the 
redshift errors. 

The sources detected with the stacking technique at 
z ^ 1 are as faint as Ss^o = 0.10 ± 0.03 mJy, which is 
10 times smaller than the noise. At z ~ 2 we can achieve 
detections of sources with 5'85o = 0.17 ± 0.05 mJy, which 
is 6 times smaller than the noise. This is equivalent to 
a gain in the signal-to-noise ratio of a factor of 30 and 
18, respectively, with respect to the 3<j detection. As for 
350 /im the stacking method is limited by the Spitzer 
detection limit. 
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Figure 13 shows the errors in the estimated mean fluxes 
at 850 /im in the 5*24 — z space for a COSMOS observa- 
tion stacked with SCUBA-2 at 850 /im with redshift error 
^ = 3% before and after the "smoothing" correction. It 
shows the improvement of the accuracy with the "smooth- 



ing" correction. Figure 13 also shows the errors in the es- 
timate of the mean fluxes for ^ = 10% (smoothing ap- 
plied). As at 350 /im, we lose accuracy in our predictions 
when the redshift errors are higher. When comparing with 
observations at 350 /im, we see that our estimates are not 
as accurate, the mean errors at 850 /im being around 15% 
compared to 5-10% at 350 /im. The problems we discussed 
for 350 /im observations are yet greater at 850 /im. The 
problem at low redshift is far more important here because 
the sources at z < 0.9 are in general fainter than at higher 



http:/ /www .jach.hawaii.edu/JCMT/surveys/Cosmology.html 



Figure 12. Relative errors in recovered fluxes for individ- 
ual sources at 850 jam in a COSMOS-like observation for 
redshift errors ^ equal to (top), 3% (middle), and 10% 
(bottom), for 6*24 > 80 /iJy and 1 < z < 2. Three estimates 
are shown: direct values obtained with the stacking (dotted 
line); values obtained with the stacking and smoothed in z 
(thin solid line); and values smoothed both in z and in 5*24 
(thick solid line). 



Planck 850 /im 

The Planck observation is hindered by the clustering 
problem caused by its large PSF (5'), rendering its flux 
estimates completely useless unless a correction is applied. 
The problem is clearly illustrated in Fig. [l4j where we 
show the histograms of the ratio of the flux estimates to 
the input fluxes for a Planck observation of the SWIRE 
fields for two selected redshift bins. We developed a simple 
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Figure 13. Accuracy of the mean recovered fluxes at 
850 /im in a COSMOS-like observation wfien considering 
a redsfiift error of ^ = 3% and no smoothing (top), a 
redshift error of ^ = 3% and the smoothing (middle) and 

a redshift error of ^ = 10% and the smoothing (bottom 
figures). The colors (shading) correspond to different val- 
ues of the accuracy, while the vertical axis is 5*24 and the 
horizontal axis is the redshift bin. The ^24 — z space is 
divided linearly in z and logarithmically in 6*24 . Redshifts 
are given on the right figures, log(S2^ (in mJy) on the left 
figures. Left figures show the relative errors on the mean 
recovered fluxes iySll^""^ = {Siig''^ - Sgl""^) / Sgl""^) for ah 
the_5'24 — z space. Right figures show the absolute values 

lyioStackl IfoStack QReal\ / oReall 

1^^850 I — |W850 ^850 J/ ^8 



■^850 



method to correct this problem. 

When stacking sources in a given redshift bin with 
Planck, we measure the added contribution of the sources 
and the clustering. To correct the stacked fluxes with 
Planck for the effects of clustering, we use source fluxes 
at 850 jam obtained by stacking SCUBA-2 data. If we stack 
sources detected by Planck for which we have an estimate 
of their fluxes inferred from SCUBA-2 data, we can obtain 
the contribution of the clustering in the Planck stacking by 
calculating the difference between the total measured flux 
and that measured in the SCUBA-2 stacking. For each red- 
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800 - 




0.5 1.0 1.5 2.0 2.5 3.0 

ostock / oreal 
-^850 / -^850 



0.5 1.0 1.5 2.0 2.5 3.0 

ostock / oreol 
-^850 / 



Figure 14. Ratio of recovered to input fluxes of individual 
sources at 850 /im for an observation of the SWIRE fields, 
redshift errors ^=10%, and two redshift bins, < z < 1 
(left) and 1 < z < 2 (right). Three estimates are shown: di- 
rect values obtained with the stacking (dotted line); values 
obtained with the stacking and smoothed in z (thin solid 
line); and values smoothed both in z and in 5*24 (thick solid 
line). The value 1 represents a perfect measurement. The 
recovered fluxes have not been corrected from the clustering 
and are thus highly overestimated. 



m 



Figure 15. Relative error in the mean recovered fluxes at 
850 /im {VSiiS''^ = {SiiS''^ - Sgl''^)/Sgl''^) for a Planck 
observations of the SWIRE fields with ^ = 10% before 
(left) and after (right) correcting from the clustering. The 
colors (shading) correspond to different values of the rela- 
tive error, while the vertical axis is 6*24 and the horizontal 
axis is the redshift bin. 



shift bin, we therefore stack Planck data for all the sources 
in a SWIRE observation with fluxes 0.27 < ^24 < 1 mJy. 
We do not use the brighter sources because their flux es- 
timates are poorer. Once we have estimated the effect of 
the clustering for different redshift bins, we can correct 
the fluxes found with Planck. Figure 15 shows the effect 
of applying this correction. We can see that the results are 
greatly improved. After the correction, the results for the 
bright sources 5*24 > 1 mJy are indeed superior for Planck 
than with SCUBA-2, because of its larger sky coverage. We 
note that the correction is assumed to be the same inside 
a redshift bin for all /S24. 



4.3. Combination of different observations 

4.3.1. Observations in the far-infrared (350 /im) 

We analyzed the Herschel observation of the COSMOS and 
SWIRE fields. We have seen that the SWIRE stacking is 
more accurate when estimating the flux of the brightest 
sources. Figure [16] shows the flux estimates at 350 /im when 
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Figure 16. Combined observations at 350 /im with red- 
shift errors ^ = 3% (top figures) and ^ = 10% 
(bottom figures). Le^t: relative errors o^the estimate of 

space. Right: absolute values 



the mean fluxes {VS^iS''^ 
the 5*24 
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to different values of the error, while the vertical axis is 5*24 
and the horizontal axis is the redshift bin. 



we combine the strengths of both observations. For sources 
with ^24 < 0.27 mJy, we have only COSMOS estimates, 
which are therefore compelled to use. Since we know that 
the SWIRE observations have higher signal-to-noise ratios 
than COSMOS observations at high fluxes, we chose to use 
these estimates for sources with 5*24 > 0.34 mJy. For the 
fainter sources stacked in SWIRE 0.27 < ^24 < 0.34 mJy 
data, we obtained errors larger than those of COSMOS 
since we assume that the colors of the faintest sources are 
as described in Sect. |3.4[ For these sources, the COSMOS 
estimates have therefore to be used. 



4.3.2. Observations in the submi Hi meter (850 /im) 

As performed at 350 /im, we analyzed the 
COSMOS/SCUBA-2 and SWIRE/Planck observations 
separately and we now combine their respective strengths. 
Figure 17 shows the error estimates for these combined 
observations. For faint sources with 6*24 < 0.27 mJy, we 
use COSMOS/SCUBA-2. For the faintest sources stacked 
in SWIRE (0.27 < S24 < 1 mJy), it is more accurate to 
use COSMOS/SCUBA-2 than Planck measurements due 




Figure 17. Same as Fig. 16 for combined observations at 
850 /im. 



to the errors induced by the uncertainty in the clustering 
contribution. For brighter sources (^24 > 1 mJy), the 
corrected Planck estimations are more accurate than 
those of SCUBA-2 and we prefer to use them. Figure [T7| 
shows the relative errors in the mean recovered fluxes with 
respect to the input fluxes at 850 /im, when combining 
both observations. They are typically of the order of 15% 
for ^ = 3%. 

4.3.3. Observations at other wavelengths 

For observations in the far-infrared and because of the 
issues discussed in Sect. |4.2| and lower typical noise level, 
the stacking technique produces more accurate estimates 
of the fluxes with Herschel than with Planck, although the 
latter has the advantage of covering the entire sky. We did 
not present separately the Herschel observations at 250 /im 
or 500 /im since the analysis of the results at these two 
wavelengths are similar to those for 350 /im observations. 
At 550 /im, a wavelength where there is a Planck but 
not a Herschel channel, it is more advisable to use the 
values found by Herschel at 500 /im after applying a small 
correction than to use the Planck values. At 850 /im, we 
combined the Planck observations with those of SCUBA-2 
although other submillimeter data (e.g., LABOCA) could 
have been used. At 1380 /im (Planck/HFI 217 GHz), we 
tested the same approach using MAMBO/IRAM simu- 
lated observations to complement the Planck observations. 
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Figure 18. True mean fluxes (diamonds) compared to tfie 
mean fluxes found by stacking (triangles) witfi tfie esti- 
mated errors at 70, 160, 250, 350, 500, and 850 /im of 
the 800 faintest sources above 80 /iJy (mean fluxes ^24 = 
110 /iJy and 5*24 = 135 /iJy at redshifts 1 < z < 1.1 and 
2 < z < 2.1, respectively) in our simulated COSMOS obser- 
vation. The points at 70 and 160 /im are from a Spitzer sim- 
ulation with instrumental noise taken from S anders et al.l 
( |2007D . T he solid lines show t he Starburst SEDs from the 
library of'Lagache et al. ( 2004 ) that has the same mean 5*24 
at those two redshifts. 



obtaining similar results as for 850 /im. 

The complete mean SEDs for the different populations 
can provide information about the mean galaxy proper- 
ties, such as star- formation rate and dust content. Figure 
18] shows our measurements at 70, 160, 250, 350, 500, and 
850 /im of the flux of the 800 faintest sources detected in 
our simulated COSMOS survey at 1 < 2: < 1.1 and at 
2 < z < 2.1 relative to both their true fluxes and the SED 
of a typical source at these fluxes and redshifts. The largest 
errors are found at 70 /im, 160 /im, and 850 /im. For both 
redshifts, the errors in our estimates are smaller than 10 %. 
The same method could be applied to fainter populations, if 
they were detected individually with Spitzer. As mentioned 
before, the limitation of the method is the detection limit 
of the Spitzer observations at 24 /im. 



5. Cleaning maps of undetected source populations 

5.1. Contribution to the CIB 

An obvious application of the results provided by the stack- 
ing technique is the measurement of the total energy emit- 
ted by different galaxy populations at wavelengths where 
they can not be seen directly. This would give us the CIB 
fraction at those wavelengths coming from the chosen pop- 
ulation. We compare the total contribution from sources 
brighter than 5*24 = 80 /iJy at redshifts z < 2 in our simu- 
lations with that determined using the stacking technique, 
and obtain very similar results. At 350 /im, we find (us- 
ing our stacking estimates) that these sources account for 
35.4% and 35.8% of the CIB when the redshift errors are 
3% and 10%, respectively. This is a 0.4% and 0.8% over- 
estimate of their contribution (35%) to the CIB of the un- 
derlying model. At 850 /im, we estimate that these sources 
account for 19% and 20% of the CIB when the redshift er- 
rors are 3% and 10%, respectively, which is a slight 2 — 3% 
overestimate of their contribution (17%) to the CIB in the 
model. 
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Figure 19. Power spectra of two maps in which we placed 
the sources with either their input fluxes from the simula- 
tions (dotted line) or their stacked fluxes (dashed line). The 
results are shown for a SWIRE observation (left figures) and 
COSMOS observation (right figures) at 350 /im for stacked 
sources up to 2: = 2 with redshift errors ^ = 3% (top) and 
^ = 10% (bottom). 
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Figure 20. Same as Fig. 19 at 850 /im 



5.2. Removing anisotropies due to low-z infrared galaxies 

A more sophisticated use of the present results is the 
statistical removal of the contribution of these populations 
at long wavelengths. If we accurately extract a sufficiently 
large fraction of the background anisotropies at low z^ this 
will allow us to study the CIB anisotropies at high z. For 
the first time, we could then separate the contributions 
to the CIB anisotropies at different redshifts. This would 
allow us to study large-scale structures at high redshift. 

To remove from the observed maps the contribution 
of sources up to a certain redshift, we create a map of 
sources for whose fluxes were estimated using the stacking 
technique. We subtract this map from the observed maps, 
which is equivalent to individually subtracting all the 
stacked sources. We estimate the source fluxes from the 



12 



N. Fernandez-Conde: Stacking analysis and study of CIB anisotropies 



colors obtained by combining the different observations, 
as described in Sect. |4.3[ However, we know tfiat tfie flux 
estimates fiave significant errors for very brigfit sources 
and sources at redsfiifts z < 0.1 at 350 fim and z < 0.8 
at 850 jam. These errors will affect the accuracy of our 
removal of the low-z background anisotropies. We also 
studied the effect of a Gaussian dispersion in the fluxes 
of the sources (as described in Sect. 2.2) on the power 
spectra. For dispersions as high as 25%, the results are 
equivalent with and without dispersion. This is because of 
the large number of sources contributing to each bin. 

To assess the importance of these errors, we compare 
the map compiled using the flux estimates by stacking 
with a second map where these sources have their true in- 
put fluxes. Comparing the power spectrum of both maps 
gives the accuracy of the anisotropy estimates for the flrst 
map. Figure [19] shows the two power spectra at 350 fim for 
sources ai z < 2 for both a SWIRE observation (with 5*24 > 
270 /iJy) and a COSMOS observation (with ^'24 > 80 /iJy) 
and for two redshift errors ^ = 3% and ^ = 10%. At 
350 /im, the accuracy of our estimation is superior to 0.5% 
for both the correlated and Poissonian part of the spectrum 
in both the SWIRE and COSMOS observations in the case 
of a small redshift error (^ = 3%). When the redshift er- 
ror is greater, our estimate of the Poissonian noise increases 
moderately with mean errors of 3%. Figure 20 shows the 
same result at 850 /im. Because of the small redshift error in 
the COSMOS survey, we overestimate the correlated part 
by 40% and the Poissonian part by 24%. For larger redshift 
errors, our overestimates increase to 60% and 50% of the 
correlated and Poissonian part, respectively. In this case, 
this shows the importance of accurate redshifts. The differ- 
ences in the overestimates of the Poissonian and correlated 
part are caused by the populations contributing to these 
two regimes not being exactly the same, bright sources con- 
tributing more in relative terms to the Poissonian fluctua- 
tions than to the correlated part. 
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Figure 23. Power spectra of the 850 jam map of the SWIRE 
flelds. The solid line is the total CIB power spectrum, the 
dashed line is the CIB power spectrum for z > zum (where 
ziim is a redshift limit), and the dotted line is the power 
spectrum of the total CIB from which we have subtracted 
the stacked sources at z < zum- The redshift limit zum is 
ziim = 1.5 (left) and zum = 2 (right). The redshift errors 
are ^ = 3% (top) and ^ = 10% (bottom). 



of the smaller size of the fleld, we do not have access to the 
largest scales that we were able to analyze with SWIRE. 
We subtract approximately ~ 99% of the correlated part 
and ~ 90% of the Poissonian for the small redshift error. 
For the large redshift error, these fractions become ^ 85% 
and ~ 90% of the correlated and Poissonian parts, respec- 
tively. For each of the considered redshift limits, the power 
spectrum of the residual left after our subtraction of the 
z < ziim stacked source is in close agreement with the 
power spectrum at high redshifts {z > zum)- This remains 
true when we consider a large redshift error. 



5.3. High-redshift power spectra of CIB anisotropies 



5.3.2. Observations at 850 /im 



5.3.1. Observations at 350 /im 

After analyzing the accuracy of the map that we intend 
to subtract, we investigate our capabilities to subtract a 
signiflcant part of the background anisotropies for different 
redshift limits. Figure [21] compares the power spectra 
of the total background anisotropies to those at z > 1, 
z > 1.5, and z > 2 in a SWIRE observation. It also 
shows the power spectra of the map of CIB anisotropies 
from which we have subtracted the z < 1, z < 1.5, and 
z < 2 contribution, which were estimated by stacking. 
Since our subtraction is rather accurate, the very small 
difference between these last two sets of power spectra is 
caused by us not subtracting all the sources but only those 
above 6*24 > 270 /iJy. We subtract approximately half 
the correlated part {k < 8 deg~^) and two thirds of the 
Poissonian part {k > 8 deg~^) independently of redshift 
errors. 



Figure [22] shows the same results for a COSMOS obser- 
vation. We have the positions of sources with S24 > 80 /iJy 
which allows us to subtract a larger fraction of the back- 
ground than in the SWIRE survey. Unfortunately because 



Figures [23] and [24] show the similar results but at 850 jam. 
For these observations, we needed to use COSMOS data 
because for SWIRE data we do not subtract a signiflcant 
fraction of the CIB anisotropies. In terms of power spectra, 
we are able with SWIRE to subtract only ~ 30% of the cor- 
related part and ^ 50% of the Poissonian part. In the case 
of COSMOS, we subtract approximately ~ 75% of both 
the correlated and Poissonian part of the power spectra. 
Figure 24 (top-right) shows that, for errors of ^ = 3%, our 



methooTs very efflcient in subtracting z < 2 anisotropies. 



6. Summary 

We have described a stacking algorithm and illustrated 
its capabilities using Spitzer observations. We have stud- 
ied the accuracy of the stacking method as a means 
of determining the average fluxes of classes of unde- 
tectable sources at long wavelengths. The results show 
that the technique will be capable of measuring accurate 
fluxes at both far-infrared and submillimeter wavelgnths 
for sources as faint as 80 /iJy at 24 /im using average colors. 
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Swire 350 /xm; z < 1 ^ Swire 550 /xm; z < 1.5 g^i^g 350 ^m; z < 2 




Figure 21. Power spectra of the map for a SWIRE observation at 350 /am. The sohd hne is the total power spectrum of 
the background, the dashed line is the power spectrum of the background for z > zum (where zum is a redshift limit), 
and the dotted line is the power spectrum of the total background from which we have subtracted the stacked sources at 
z < ziim- The redshift limit zum is zum = 1 (left figures), zum = 1.5 (middle figures), and zum = 2 (right figures). The 
redshift errors are ^ = 3% (top) and ^ = 10% (bottom). 




With the successful commissioning of the Planck and 
Herschel missions, large maps (even all-sky for Planck) 
from 250 /im to the millimeter wavelength range are now 
available. SCUBA-2 and other submillimeter cameras (e.g., 
LABOCA) will provide data of higher angular resolution 
in the submillimeter. We have applied the stacking method 
to the Herschel, Planck, and SCUBA-2 simulated data and 
measured the full average SED of populations of sources 
detected at 24 /im. The strong variation in the 824/ Sx 



color with redshift requires us to define the populations 
to which the method will be applied not only in ranges of 
5*24 but also in terms of (photometric) redshift. We show 
we are able to measure the mean flux of populations 4 to 
6 times fainter than the total noise at 350 /im at redshifts 
z = 1 and z = 2, respectively, and 6 to 10 times fainter 
than the total noise at 850 /im, at the same redshifts. We 
have been able to reproduce the SED at wavelengths 70, 
160, 250, 350, 500, and 850 /im of a population of sources 
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Cosmos 850 /xm; z < 1 .5 



Cosmos 850 yum; z < 2 




k (deg"') 



Figure 24. Same as Fig. 23 but for the COSMOS field. 



witfi mean flux S2A = 0.11 mJy and S24 = 0.135 mJy at 
redsfiifts z = 1 and z = 2, respectively. 

In tfie deep Spitzer fields, tfie detected 24 fim sources 
constitute a large fraction of the anisotropies. We have 
shown that the method presented in this paper enables an 
excellent (350-850 COSMOS) to good (350-850 SWIRE) 
removal of both the Poissonian and correlated low-z 
anisotropies. The relative contribution of sources to the 
background anisotropies up to z = 2 decreases with 
wavelength in the model. This property is expected to 
remain valid independently of the details of the model from 
250 /im to the millimeter range. Although the accuracy of 
the subtracted map is lower at 850 /im, the cleaning of the 
power spectrum is quite effective (because the contribution 
of the low-redshift sources is small at these submillimeter 
wavelengths) . 

The same technique could also be used to remove from 
the observations all the contributions from sources for 
which we have estimated a flux, to decrease the confusion 
noise caused by infrared galaxies. This would be interesting 
for the detection of other types of sources (for example, SZ 
sources in Planck data). 

The method allows us to build z > 1 — 2 CIB maps 
from the submillimeter to the millimeter. We have found 
that the method can also be successfully applied at the 
other Herschel and Planck wavelengths than those tested 
in this paper. The longer wavelengths at which this can 
be achieve will depend on the success of the component 
separation and not on the removal of the z < 2 sources. 
We can then hope to have a set of large CIB maps dom- 
inated by high-redshift galaxies. This set of CIB maps at 
different wavelengths dominated hy z > 2 sources will be a 
powerful tool for studying the evolution of the large-scale 
structure of infrared galaxies. The effect of the K-correction 
ensures that each of these maps (at different wavelengths) 
are dominated by particular high-redshift ranges. Methods 
of independent component separation based on the corre- 



lation matrix between these maps (e.g., Del abrouille et al. 
2003 ) should allow us to extract maps and power spectra for 



a number of redshift ranges equal to the number of maps. 
This last step will fulfill the main objective of this work. It 



will allow the study of the evolution of the IR galaxy clus- 
tering at high redshifts by means of the power spectrum 
analysis of CIB anisotropies. These maps may also be used 
to help us understand the contribution of high-z IR galaxies 
both to the CIB and the star-formation history. 
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Appendix A: Alternative correction for the 

clustering contribution to the stacked fluxes in 
Planck maps 

We developed an alternative method for correcting the pho- 
tometry of a group of stacked sources for the effects of the 
clustering. If we consider that the signal measured for a 
population of stacked sources at a given wavelength is the 
combination of the signal originating from the sources and 
from the clustering, we can write the measured flux as: 



^measured ^sources _|_ ^clustering _|_ ^ 



(A.1) 



where S^^^^^^^^ is the total measured signal, ^^^^^^es 
is the part of the signal coming from the sources, and 

^clustering ^^^^ ^.^^^^ 

coming from the sources 
correlated with the detected sources that we are stacking, 
and (J is the noise. 

If two populations of sources have very similar fluxes 
at the wavelength of detection (24 jim) and are situated 
at similar redshifts, we can assume that their sources have 
very similar physical characteristics and hence their colors 
S\/S2A are very similar. In this case, we can write: 



Qsources 
^sources 



Qsources 
^24 



^sources 



(A.2) 



where the A and B subscripts represent the first and second 
population of sources. We can measure the total flux (from 
the sources and the clustering) for the stacking of both 
source populations and express them as: 



^gtotal gsources _|_ ^clustering _|_ 



( 1^^^^^^ (^sources 



^clustering _|_ ^ 



(A.3) 

(A.4) 



If we were to assume that the contribution of the corre- 
lated sources to the flux is the same for both populations 
^gciustering^^ = (^^^-^--^)^^ as cxpected for sources 
with similar spatial distributions, and that the noise is 
negligible, we would have a system of three equations with 
three unknowns that we can solve. 

The main problem for the applicability of this method 
is that we need to stack many sources to ensure that the 
noise becomes negligible compared to the signal. Because 
of this, it is preferable to combine an observation whose 
photometry is aff'ected by the clustering with another ob- 
servation for which this problem does not exist, as illus- 
trated by our present analysis. If the photometry of this 
second observation is affected by smaller errors (as it is the 
case of SCUBA-2 data relative to Planck data at 850 /im), 
the results will be improved by combining the two obser- 
vations. However, the method discussed in this appendix 
is applicable to cases where we do not have an alternative 
observation with which we can correct from the clustering 
problem. 
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